DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections

Park, Jiwon, Pyeon, Seohyun, Kim, Jinwoo, Cabal, Rina Carines, Ding, Yihao, Han, Soyeon Caren

arXiv.org Artificial Intelligence

Despite recent advances in large language models (LLMs), most QA benchmarks are still confined to single-paragraph or single-document settings, failing to capture the complexity of real-world information-seeking tasks. Practical QA often requires multi-hop reasoning over information distributed across multiple documents, modalities, and structural formats. Although prior datasets made progress in this area, they rely heavily on Wikipedia-based content and unimodal plain text, with shallow reasoning paths that typically produce brief phrase-level or single-sentence answers, thus limiting their realism and generalizability. We propose DocHop-QA, a large-scale benchmark comprising 11,379 QA instances for multimodal, multi-document, multi-hop question answering. Constructed from publicly available scientific documents sourced from PubMed, DocHop-QA is domain-agnostic and incorporates diverse information formats, including textual passages, tables, and structural layout cues. Unlike existing datasets, DocHop-QA does not rely on explicitly hyper-linked documents; instead, it supports open-ended reasoning through semantic similarity and layout-aware evidence synthesis. To scale realistic QA construction, we designed an LLM-driven pipeline grounded in 11 high-frequency scientific question concepts. We evaluated DocHop-QA through four tasks spanning structured index prediction, generative answering, and multimodal integration, reflecting both discriminative and generative paradigms. These tasks demonstrate DocHop-QA's capacity to support complex, multi-modal reasoning across multiple documents.


Pope Leo identifies AI as main challenge in first meeting with cardinals

Al Jazeera

Pope Leo XIV has held his first meeting with the world's cardinals since his election as the head of the Catholic Church, identifying artificial intelligence (AI) as one of the most crucial issues facing humanity. Leo, the first American pope, laid out a vision of his papacy at the Vatican on Saturday, telling the cardinals who elected him that AI poses challenges to defending "human dignity, justice and labour" – a view shared with his predecessor, the late Pope Francis. Explaining his choice of name, the pontiff said he identified with the late Leo XIII, who had defended workers' rights during his 1878-1903 papacy at the dawn of the industrial age, adding that "social teaching" was now needed in response to the modern-day revolution brought by AI. The late Pope Francis, who died last month, warned that AI risked turning human relations into mere algorithms and called for an international treaty to regulate it. Francis warned the Group of Seven industrialised nations last year that AI must remain human-centric, so that decisions about when to use weapons or even less-lethal tools would not fall to machines.


A Systematic Literature Review on Client Selection in Federated Learning

Smestad, Carl, Li, Jingyue

arXiv.org Artificial Intelligence

With rising concerns about privacy in machine learning, federated learning (FL) was introduced in 2017. In FL, clients such as mobile devices train a model locally and send their updates to a centralized server. Choosing clients randomly for FL can harm learning performance for several reasons. Many studies have proposed approaches to address the challenges of client selection in FL. However, no systematic literature review (SLR) on this topic existed. This SLR investigates the state of the art of client selection in FL and identifies the challenges, the solutions, and the metrics used to evaluate those solutions. We systematically reviewed 47 primary studies. The main challenges found in client selection are heterogeneity, resource allocation, communication costs, and fairness. The client selection schemes aim to improve the original random selection algorithm by focusing on one or several of the aforementioned challenges. The most common metric is testing accuracy versus communication rounds: testing accuracy measures the success of the learning, which should preferably be reached in as few communication rounds as possible because rounds are very expensive. Although several improvements to the current state of client selection are possible, the most beneficial ones are evaluating the impact of unsuccessful clients and gaining a more theoretical understanding of the impact of fairness in FL.
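
The random selection baseline that the abstract refers to can be illustrated with a small sketch. This is not taken from the review or from any of the 47 studies; the function names `select_clients_randomly` and `federated_average`, the fraction 0.3, and the toy updates are assumptions made for illustration. The server picks a fraction of clients uniformly at random, then combines their updates with a sample-count weighted average.

```python
import random


def select_clients_randomly(client_ids, fraction, rng):
    """Pick max(1, round(fraction * K)) clients uniformly, without replacement."""
    k = max(1, round(fraction * len(client_ids)))
    return rng.sample(client_ids, k)


def federated_average(updates):
    """Average model vectors, weighting each by its client's sample count.

    `updates` is a list of (weights_vector, num_samples) pairs.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Hypothetical setup: 10 clients, 30% of them chosen per communication round.
rng = random.Random(0)
clients = list(range(10))
chosen = select_clients_randomly(clients, 0.3, rng)

# Two toy client updates: a 2-dimensional "model" plus local sample counts.
avg = federated_average([([1.0, 2.0], 10), ([3.0, 4.0], 30)])
```

Because every client is equally likely to be chosen, this baseline ignores the heterogeneity, resource, communication, and fairness concerns that the reviewed schemes try to address.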


What Are the Four main challenges in Machine Learning?

#artificialintelligence

Machine Learning (ML) is a rapidly growing field that has the potential to revolutionize many aspects of society, including healthcare, finance, transportation, and entertainment. However, like any rapidly growing field, ML also faces significant challenges. In this article, I will discuss four of the main challenges in machine learning. The first challenge in ML is data quality and quantity. ML models require large amounts of high-quality data to learn and make accurate predictions.


HuBMAP + HPA -- Hacking the Human Body

#artificialintelligence

Our Winstars team recently participated in the Kaggle competition HuBMAP + HPA -- Hacking the Human Body and finished in 95th place, earning a bronze medal among 1175 contenders. In this paper, we would like to present our solution and highlight all the essential techniques used. A big part of the given solution can be carried over to other deep-learning tasks with little or no modifications. The paper is structured as follows: first, we briefly present the competition and its main challenges.


Cryptominer detection: a Machine Learning approach – Sysdig

#artificialintelligence

Cryptominers are one of the main cloud threats today. Miner attacks are low risk, low effort, and high reward for a financially motivated attacker. Moreover, this kind of malware can pass unnoticed because, with proper evasive techniques, it may not disrupt a company's business operations. Given all the possible evasion strategies, detecting cryptominers is a complex task, but machine learning could help to develop a robust detection algorithm. However, being able to assess the model performance in a reliable way is paramount.
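
One general reason reliable assessment matters for a detector like this is class imbalance: malicious samples are usually rare, so plain accuracy can look good for a model that detects nothing. The sketch below is an illustration under that assumption and does not reproduce the article's evaluation; the labels, the 95/5 split, and the function name `precision_recall` are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical data: 95 benign processes (0) and 5 miners (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "detector" that labels everything as benign.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
```

Here accuracy is high even though no miner is caught, while recall exposes the failure, which is why precision and recall are commonly reported alongside accuracy for rare-event detection.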


A Survey on Participant Selection for Federated Learning in Mobile Networks

#artificialintelligence

Federated Learning (FL) is an efficient distributed machine learning paradigm that employs private datasets in a privacy-preserving manner. The main challenges of FL are that end devices usually possess various computation and communication capabilities and that their training data are not independent and identically distributed (non-IID). Due to limited communication bandwidth and unstable availability of such devices in a mobile network, only a fraction of end devices (also referred to as the participants or clients in an FL process) can be selected in each round. Hence, it is of paramount importance to utilize an efficient participant selection scheme to maximize the performance of FL, including final model accuracy and training time. In this paper, we provide a review of participant selection techniques for FL.
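
To make the selection problem concrete, here is a minimal sketch of one possible resource-aware rule. It is a hypothetical illustration, not a technique attributed to the survey: the `Device` fields, the per-round deadline, and the function `select_participants` are all assumptions. Only available devices whose estimated compute plus upload time fits the deadline are eligible, and the fastest ones are chosen up to a per-round limit.

```python
from dataclasses import dataclass


@dataclass
class Device:
    device_id: str
    available: bool
    compute_seconds: float  # estimated local training time (assumed known)
    upload_seconds: float   # estimated time to upload the update (assumed known)


def select_participants(devices, max_participants, deadline_seconds):
    """Return ids of the fastest available devices that can meet the deadline."""
    eligible = [
        d for d in devices
        if d.available and d.compute_seconds + d.upload_seconds <= deadline_seconds
    ]
    eligible.sort(key=lambda d: d.compute_seconds + d.upload_seconds)
    return [d.device_id for d in eligible[:max_participants]]


# Hypothetical device pool for one round.
devices = [
    Device("a", True, 5.0, 2.0),    # total 7 s
    Device("b", False, 1.0, 1.0),   # fast but currently unavailable
    Device("c", True, 3.0, 1.0),    # total 4 s
    Device("d", True, 20.0, 5.0),   # total 25 s, misses the deadline
    Device("e", True, 4.0, 4.0),    # total 8 s
]
selected = select_participants(devices, max_participants=2, deadline_seconds=10.0)
```

A rule like this shortens round time but always favors fast devices, so it can bias the model when data are non-IID; that trade-off between training time and final accuracy is the one the abstract highlights.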


How Copying the Human Brain Could Make AI Smarter

#artificialintelligence

Artificial intelligence that mimics the human brain could result in smarter, more efficient computers, experts say. Nara Logics' new AI engine uses recent discoveries in neuroscience to replicate brain structure and function. The research is part of a decades-long quest to make computers that can "think" as well as or better than humans. Simulating brain function is one promising approach. "There are obvious benefits to copying what seems to work in biology and implementing them in machines to aid automated decision making in a broad spectrum of daily activities," Stephen T.C. Wong, a computer science professor at Houston Methodist Research Institute, said in an email interview. The uses for humanlike AI could range "from playing chess, recognizing faces, and trading stocks to making a medical diagnosis, driving autonomous vehicles, and engaging business negotiations or even legal litigation," he added.


GitHub - royorel/StyleSDF

#artificialintelligence

Training files will be released soon. StyleSDF is trained only on single-view RGB data. The 3D geometry is learned implicitly with an SDF-based volume renderer. We introduce a high resolution, 3D-consistent image and shape generation technique which we call StyleSDF. Our method is trained on single-view RGB data only, and stands on the shoulders of StyleGAN2 for image generation, while solving two main challenges in 3D-aware GANs: 1) high-resolution, view-consistent generation of the RGB images, and 2) detailed 3D shape.


Council Post: The Three Main Challenges Of AI Safety

#artificialintelligence

Omneky utilizes state-of-the-art deep learning to empower businesses to grow. Despite having the capability to add $15.7 trillion to the economy by 2030 and increase business productivity by 40%, AI has many technical complications. These problems threaten AI safety and create obstacles for companies and their users. The dominant issues AI systems face are poor data quality, corruption, and the difficulty of debugging a new technology. Problems with data quality can have a large impact on the output of AI systems.